Morphogenesis by coupled regulatory networks 
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Based on a recently proposed non-equilibrium mechanism for spatial pattern formation we 
study how morphogenesis can be controlled by locally coupled discrete dynamical networks, similar 
to gene regulation networks of cells in a developing multicellular organism. As an example we 
study the developmental problem of domain formation and proportion regulation in the presence 
of noise and cell flow. We find that networks that solve this task exhibit a hierarchical structure 
of information processing and are of similar complexity as developmental circuits of living cells. 
A further focus of this paper is a detailed study of noise-induced dynamics, which is a major 
ingredient of the control dynamics in the developmental network model. A master equation for 
domain boundary readjustments is formulated and solved for the continuum limit. Evidence for a 
first order phase transition in equilibrium domain size at vanishing noise is given by finite size scaling. 
A second order phase transition at increased cell flow is studied in a mean field approximation. 
Finally, we discuss potential applications. 

PACS: 87.f8.Hf, 05.70.Ln, 05.50.+q 
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I. INTRODUCTION 

Understanding the molecular machinery that regu- 
lates development of multicellular organisms is among 
the most fascinating problems of modern science. To- 
day, a growing experimental record about the regula- 
tory mechanisms involved in development is accumulat- 
ing, in particular in well-studied model-organisms as, 
e.g., Drosophila or Hydra 0, Still, the genomic de- 
tails known today are not sufficient to derive dynami- 
cal models of developmental gene regulation processes 
in full detail. Phenomenological models of developmen- 
tal processes, on the other hand, are well established to- 
day. Pioneering work in this field was done by Turing, 
who in his seminal paper in 1952 Q considered a purely 
physico-chemical origin of biological pattern formation. 
His theory is based on an instability in a system of cou- 
pled reaction-diffusion equations. In this type of model, 
for certain parameter choices, stochastic fluctuations in 
the initial conditions can lead to self-organization and 
maintenance of spatial patterns, e.g. concentration gra- 
dients or periodic patterns. This principle has been suc- 
cessfully applied to biological morphogenesis in numer- 
ous applications 0,01 . However, as current experiments 
make us wonder about the astonishingly high complexity 
of single regulating genes in development they also 
seem to suggest that diffusion models will not be able to 
capture all details of developmental regulation, and point 
at a complex network of regulating interactions instead. 

The role of information processing in gene regulatory 
networks during development has entered the focus of 
theoretical research only recently. One pioneering study 
was published by Jackson et al. pj in 1986, who investi- 
gated the dynamics of spatial pattern formation in a sys- 
tem of locally coupled, identical dynamical networks. In 
this model, gene regulatory dynamics is approximated by 
Boolean networks with a subset of nodes communicating 
not only with nodes in the (intracellular) network, but 



also with some nodes in the neighboring cells. Boolean 
networks are minimal models of information processing 
in network structures and have been discussed as mod- 
els of gene regulation since the end of the 1960s 0, 0- 
The model of Jackson et al. demonstrated the enormous 
pattern forming potential of local information process- 
ing, similar to direct contact induction |l(j as known in 
developmental biology. More recently, Salazar-Ciuadad 
et al. introduced a gene network model based on contin- 
uous dynamics [111 [12J and coupling their networks by 
direct contact induction. Interestingly, they observe a 
larger variety of spatial patterns than Turing-type mod- 
els with diffusive morphogens, and find that patterns 
are less sensitive to initial conditions, with more time- 
independent (stationary) patterns. This matches well 
the intuition that networks of regulators have the poten- 
tial for more general dynamical mechanisms than diffu- 
sion driven models. Let us here therefore study interact- 
ing networks in pattern formation and in particular con- 
sider information-transfer-based processes. In [l3| . we 
introduced a simple stochastic cellular automata model 
of spatial pattern formation based on local information 
transfer, that performs de novo pattern formation by gen- 
erating and regulating a domain boundary. In the fol- 
lowing we will study how this very general mechanism 
could emerge as a result of interacting nodes in coupled 
identical networks, similar to gene regulation networks in 
interacting cells. 

The goal of this paper is twofold: First we will explore 
ways to map the earlier cellular automata model |13| . 
onto dynamical networks (Boolean networks and thresh- 
old networks), motivated by the general observation of 
gene regulation networks in biological development (sec- 
tion III). Secondly, we will give a more detailed account of 
the basic underlying mechanism, focusing on the stochas- 
tic dynamics (section IV) and on how the model was de- 
veloped using a genetic algorithm (Appendix) : In section 
IV, the statistical mechanics of noise-induced dynamics 
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is studied in detail. Numerical evidence for a first order 
phase transition at vanishing noise rate is given. A mas- 
ter equation for boundary readjustments is formulated 
and solved in the continuum limit. Robustness against 
cell flow is studied numerically and in a mean field ap- 
proximation. In the Appendix, the genetic algorithm ap- 
plied for identification of model solutions is explained and 
statistical properties of different solutions are compared. 



II. PROBLEM AND MODEL 

Let us first motivate and define the morphogenetic 
problem that we will use as a test case. Then we un- 
dertake a three-step approach to find a genetic network 
model that solves this pattern formation problem. In the 
first step, we summarize the properties of the cellular au- 
tomata model introduced in 13]. Cellular automata as 
dynamical systems discrete in time and state space are 
known to display a wide variety of complex patterns Q 
and are capable of solving complex computational tasks, 
including universal computation. We searched for solu- 
tions (i.e. rule tables which solve the problem) by aid of a 
genetic algorithm (for details, see Appendix). Candidate 
solutions have to fulfill four demands: Their update dy- 
namics has to generate a spatial pattern which 1) obeys 
a predefined scaling ratio a/(l — a), 2) is independent of 
the initial condition chosen at random, 3) is independent 
of the system size (i.e. the number of cells Nc) and 4) is 
stationary (a fixed point). In the second step, this cel- 
lular automata rule table is "translated" into (spatially 
coupled) Boolean networks, using binary coding of the 
cellular automata states. The logical structure of the ob- 
tained network is reduced to a minimal form, and then, 
in step three, translated into a threshold network. 



A. A test scenario: Hydra foot formation 

A classical model organism for studies of position de- 
pendent gene activation is the fresh water polyp Hy- 
dra, which has three distinct body regions - a head with 
mouth and tentacles, a body column and a foot region. 
The positions of these regions are accurately regulated 
along the body axis. 

The problem we shall focus on is sketched schemati- 
cally in Fig. njfor the example of the "foot" genes Pedibin 
and CN-NK2 15] . CN-NK2 is active only in the lowest 
foot region, the expression domain of Pedibin extends a 
bit more towards the rump region, where it is turned off 
at some point, i.e. the domain shows a sharp boundary. 
The relative position of the boundary is almost indepen- 
dent of the animal's size, i.e. the ratio a/(l — a) (as 
denoted in Fig. ^) is almost invariant under changes of 
body size. As the Pedibin/CN-NK2 system presumably 
plays an important role in determining the foot region, 
this invariance appears to be an essential prerequisite 
for maintaining the correct body proportions (propor- 



tion regulation) and to establish the head-foot polarity. 
Such regulation of position information is a quite general 
problem in biological development [l6j . An interesting 
problem is how the specific properties of this regulation 
can be achieved by a small network of regulatory genes 
and if so, whether local communication between the cells 
(networks) is sufficient. This basic question is the cen- 
tral motivation for the present study. In particular, we 
consider the simplified problem of regulating one domain 
as, for example, the foot region versus the rest of the 
body. We consider this as a one dimensional problem as 
first approximation to the well-defined head-foot-axis in 
Hydra. 




head 



rump 



foot 



FIG. 1: Gene expression domains in Hydra, here for the exam- 
ple of the "foot" genes Pedibin (Ped) and Cn-NK2. The Ped- 
ibin domain shows a quite sharp boundary towards the body 
region of the animal. The relative position of the boundary, 
given by the ratio a/(l — a), is independent of the absolute 
size of the animal (proportion regulation). 

Developmental processes exhibit an astonishing ro- 
bustness. This often includes the ability of de novo pat- 
tern formation, e.g., to regenerate a Hydra even after 
complete dissociation of the cell ensemble in a centrifuge 
[13 ■ Further, they are robust in the face of a steady 
cell flux: Hydra cells constantly move from the central 
body region along the body axis towards the top and 
bottom, where they differentiate into the respective cell 
types according to their position on the head-foot axis. 
The global pattern of gene activity is maintained in this 
dynamic environment. Let us take these observations as 
a starting point for a detailed study how the interplay 
of noise-induced regulatory dynamics and cell flow may 
stabilize a developmental system. 

B. One dimensional cellular automata: Definitions 

To define a model system that performs the pattern 
formation task of domain self-organization [l3l ] , consider 
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FIG. 2: Diagram showing the interaction structure of the 
minimal network needed to solve the asymmetric expression 
task. For the sake of clarity, intracellular interactions between 
the two genes Gi and G2 are shown only for cell i, and like- 
wise outgoing intercellular signals from the two genes two the 
neighbor cells i — 1 and i + 1 were left out. The transcription 
factors produced by gene Gi and G2 in cell i — 1 couple to 
the receptor systems Ri and R2 , respectively, whereas in cell 
i + 1 the transcription factors produced by these genes couple 
to the receptor systems i?3 and R4 (biased signaling). In cell 
i, the receptors release factors which regulate the activity of 
Gi and G2. 



a one-dimensional cellular automaton with parallel up- 
date 0|- Nc cells are arranged on a one-dimensional 
lattice, and each cell is labeled uniquely with an index 
i £ {0,1,..., Nc — 1}. Each cell can take n possible states 
<7j € {0,1,.., n}. The state <7j(i) of cell i is a function 
of its own state Oi{t — 1) and of its neighbor's states 
<?i-i(t — 1) and CTi+i(< — 1) at time i—1, i.e. 

a i (t) = f[a^ 1 (t-l),a i (t-l),a i+1 (t-l)} (1) 

with / : {0, 1, n} 3 1— > {0, 1, n} (a cellular automaton 
with neighborhood 3). At the system boundaries, for sim- 
plicity we choose a discrete analogue of zero flux bound- 
ary conditions, i.e. we set cr_i = o-n c +i = const. = 0. 
Other choices, e.g. asymmetric boundaries with cell up- 
date depending only on the inner neighbor cell, lead to 
similar results. The state evolution of course strongly 
depends on the choice of /: for a three-state cellular au- 
tomaton (n = 3), there are 3 27 » 7.626 • 10 12 possible 
update rules, each of which has a unique set of dynam- 
ical attractors. As we will show in the results section, 
n = 3 is the minimal number of states necessary to solve 
the pattern formation problem formulated above. 

Now we can formulate the problem we intend to solve 
as follows: Find a set T of functions (update rules) which, 
given a random initial vector a = (<7o, •••> o~N c -i)i within 
T update steps evolves the system's dynamics to a fixed 
point attractor with the property: 
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i <[a- N c ] 
i>[a- N c ] 



(2) 



where [•] is the Gauss bracket. The scaling parameter 
a may take any value < a < 1. For simplicity, it is 
fixed here to a — 0.3. Notice that a does not depend 



on Nc, i.e. we are looking for a set of solutions where 
the ratio of the domain sizes r := a/{\ — a) is invari- 
ant under changes of the system size. This clearly is a 
non-trivial task when only local information transfer is 
allowed. The ratio r is a global property of the system, 
which has to emerge from purely local (next neighbor) 
interactions between the cell's states. 



Translation into spatially coupled Boolean 
networks 



One can now take a step further towards biological sys- 
tems, by transferring the dynamics we found for a cellu- 
lar automata chain onto cells in a line that communicate 
with each other, similar to biological cells. Identifying 
states with cell types and assuming that the model cells 
have a network of regulators inside, each of them capable 
to reproduce the rules of a cellular automaton, we obtain 
a model mimicking basic properties of a biological genetic 
network in development. 

Cellular automata rule tables can easily be translated 
into logical (Boolean) networks, e.g., for n = 3, two inter- 
nal nodes can be used for binary coding of the cell states. 
One then has 

(4(t),4(t)) = /i(< 2 1 (t-i),ai i2 (t-i),a{+ 1 (i-i)) (3) 

with fi : {0, l} 6 h-> {0, l} 2 . The so obtained rule tables 
are, by application of Boolean logic, transformed into a 
minimized conjunctive normal form, which only makes 
use of the the three logical operators NOT, AND and 
OR with a minimal number of AND operations. This is a 
rather realistic assumption for gene regulatory networks, 
as the AND operation is more difficult to realize on the 
basis of interactions between transcription factors. Other 
logical functions as, e.g., XOR, are even harder to real- 
ize biochemically The structure of the constructed 
network and its biological interpretation is shown in Fig. 
121 Note that the up-down symmetry of the body axis is 
broken locally by a spatially asymmetric receptor distri- 
bution m. 



D. Translation into spatially coupled threshold 
networks 

Perhaps the simplest model for transcriptional reg- 
ulation networks are threshold networks, a subset of 
Boolean networks, where logical functions are modeled by 
weighted sums of the nodes' input states plus a threshold 
h [20I l2l| . They have proven to be valuable tools to ad- 
dress questions associated to the dynamics and evolution 
of gene regulatory networks pi I23L I2I I2I l2rj| . 

Any Boolean network which has been reduced to its 
minimized form (here, the conjunctive normal form), can 
be coded as a dynamical threshold network, with mini- 
mally three hierachies of information processing ( "input 
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layer": signals from the neighbor cells at time t — 1, 
"hidden layer": logical processing of the signals, "out- 
put layer" : states of the two "pattern genes" in cell i at 
time t). The genes' states now may take values cr^ = ±1, 
and likewise for the interaction weights one has Cy = ±1 
for activating and inhibiting regulation, respectively, and 
c\j — if gene i does not receive an input from gene j in 
cell /. The dynamics then is defined as 

oj(t)= sign (/,•(* -1)) (4) 

with 

2 i+1 

/j(i) = EE4'°i + ft < ( 5 ) 

k=l l=i-l 

for the "hidden" genes, where <r k ,k £ {1,2} is the state 
of the fcth output gene in cell / (there are no couplings 
between the genes in the "hidden" layer) . The threshold 
hj is given by 

2 i+1 

kj=EE I4l- 2 ' ( 6 ) 
fc=i i=i-i 

which implements a logical OR operation. For the "out- 
put" (pattern forming) genes, one simply has 

fc*, 

/k(t) = £*«-A&, (7) 
i=i 

i.e. the weights are all set to one, and the (negative) 
threshold equals the number of inputs kf n that gene k 
receives from the "hidden" genes (logical AND). 

III. RESULTS FOR DETERMINISTIC 
DYNAMICS 

A. Cellular Automata Model 



conditions becomes arbitrarily small with increasing sys- 
tem size. Hence, the pattern self-organization in this 
system exhibits considerable robustness against fluctu- 
ations in the initial conditions. The main mechanism 
leading to stabilization at evoo = 0.281 is a modulation of 
the traveling velocity of the right phase boundary in Fig. 
2 such that the boundary on average moves slightly more 
than one cell to the left per update step, whereas the left 
boundary moves one cell to the right exactly every third 
update step, the modulation of the right boundary can 
be seen as the result of interacting phase boundaries rem- 
iniscent of particle interactions. This picture of "particle 
computation" is a useful concept also in various other 
contexts [27j . 




FIG. 3: A typical dynamical run for the Automata as defined 
in Table I, here for a system of size Nc = 250 cells (deter- 
ministic dynamics, no noise), starting from a random initial 
configuration. Time is running on the j/-axis from top to bot- 
tom. Cells with state Oi — are depicted in black color, cells 
with <Tj = 1 in red and cells with cr* = 2 in blue. 



The first major outcome of the cellular automata 
model is that a number of n = 3 different states is nec- 
essary and sufficient for this class of systems to solve the 
given pattern formation task. The update table of the 
fittest solution found during optimization runs, which 
solves the problem independent of system size for about 
98 percent of (randomly chosen) initial conditions (i.e. 
has fitness $ = 0.98), is shown in Table I. Fig. [3] shows 
the typical update dynamics of this solution. The finite 
size scaling of the self-organized relative domain size a as 
a function of the the number of cells Nq is shown in Fig. 
0] In the limit of large system sizes, a converges towards 

a x = 0.281 ± 0.001. (8) 

The variance of a vanishes with a power of Nc, he. the 
relative size of fluctuations induced by different initial 



B. Interaction topology of the minimal network 

In this section let us translate the rule table found in 
the previous section into a (gene) regulatory network. 
We will see that this network has biologically realistic 
properties regarding the number of genes necessary for 
information processing and the complexity of interaction 
structure, making it well conceivable that similar "devel- 
opmental modules" exist in biological systems. 



1. Boolean representation 

The rule table of Table 1 first is translated into binary 
coding, i.e — > 00, 1 — > 01 and 2 — > 10, this corre- 
sponds to two "genes" G\ and G2 one of which (Gi) 
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FIG. 4: Finite size scaling of the self-organized relative do- 
main size a as a function of the total number of cells Nc- In 
the limit of large system sizes, a converges towards a fixed 
value Ooo = 0.281 ± 0.001 (as denoted by the straight line 
fit). The inset shows the finite size scaling of the variance 
Var[a(Nc)]; the straight line in this log-log plot has slope 
— 1.3 and indicates that fluctuations vanish with a power of 
the system size. 
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TABLE I: Rule table of most successful best cellular automata 
solution found during genetic algorithm search. In the left 
column, the rule table index is shown, running from to 26, in 
the middle column the three input states at time t are shown, 
the right column shows the corresponding output states at 
time t + 1 . 



Egf 



I Of 1 

EGL 

Gj 





E^ 
E^rH 



EgO 




Gi+ 



[gFH 

FIG. 5: Boolean representation of the minimal network, min- 
imized conjunctive normal form. G h a with a 6 {1, 2} and 
b 6 {i — 1, i, i+ 1} denotes gene a in cell number 6. The in- 
puts in the left branches of the trees are given by the genes' 
states at time t — 1. — > denotes NOT, denotes logical AND 
and © logical OR. 



involve a higher number of logical sub-processing steps, 
i.e. a higher number of genes, hence we will not discuss 
them here. 

Considering the huge number of possible input con- 
figurations which the outputs theoretically could depend 
on, the complexity of the resulting network is surpris- 
ingly low. As shown in Fig. the output state of gene 
G\ only depends on five different input configurations 
of at maximum four different inputs, gene number two 
on six different input configurations of at maximum four 
different inputs. This indicates that the spatial infor- 
mation flowing into that network is strongly reduced by 
internal information processing (only a small number of 
input states leads to output "1"), as expected for the 
simple stationary target pattern. Nevertheless, this in- 
formation processing is sufficient to solve the non-trivial 
task of domain size scaling. 



is active only in a domain at the left side of the cell 
chain. The so obtained Boolean update table is reduced 
to its minimized conjunctive normal form, using a Quinc- 
McCluskey algorithm |2g. For the construction of the 
network topology we use the conjunctive normal form, 
as it is a somewhat biologically plausible solution with a 
minimal number of logical AND operations. In principle, 
other network topologies, e.g. with more levels of hierar- 
chy, are possible and biologically plausible, however, they 



2. Threshold network implementation 

Alternatively, the Boolean representation of the pat- 
tern formation system can be translated into a three- 
layered threshold network (Fig. H3). The states of the 
genes G\ and Gi at time t in a cell i and its two neigh- 
bor cells i — 1 and i + 1 serve as inputs of 11 information- 
processing genes ("hidden" layer). The state of these 
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FIG. 6: Threshold network realization of the pattern forma- 
tion system. Solid line arrows denote links with my = +1, 
dashed arrows denote links with Wij — —1. The inputs from 
the genes in the neighbor cells (Gp 1 ^ -1 , G\ +1 and G\ +1 ) 
are processed by a layer of "hidden genes" (hollow circles) 
with different thresholds h (as these "genes" perform a log- 
ical OR operation, one has h = ki n — 2). The threshold of 
gene Gi is hi — —5, for gene G2 one has /12 = —6 (logical 
AND). 



genes then defines the state of G\ and G2 in cell i at 
time t + 2 (output layer). Additionally, there is some 
feedback from G\ and G2 to the information process- 
ing layer, as expected for the dependence on cell-internal 
dynamics already present in the cellular automata imple- 
mentation of the model. The resulting stationary spatial 
patterns of gene G\ (the "domain gene"), G2 (active only 
at the domain boundary) and the "hidden" genes after 
system relaxation are shown in Table 2, for a system size 
of Nc = 30 cells. From this seemingly redundant pat- 
tern in equilibrium one hardly guesses the higher level of 
genetic information processing during the self-organizing 
phase. The network we construct here, regarded as a 
"developmental module" defining the head-foot polarity 
through spatially asymmetric gene expression, has a size 
similar to biological modules (compare, for example, the 
segment polarity network in Drosophila 29]) as well as 
similar complexity (average connectivity K ss 3). 



IV. DYNAMICS UNDER NOISE AND CELL 
FLOW 

In the following we will study dynamics and robust- 
ness of the model with respect to noise. Two kinds of 
perturbations frequently occur: Stochastic update errors 
and external forces cased by a directed cell flow due to 
cell proliferations. Both types of perturbations are very 
common during animal development, e.g., in Hydra cells 
continuously move from the central body region along the 
body axis towards the top and bottom, and differentiate 
into the respective cell types along the way according to 
their position on the head-foot axis. 

Let us define stochastic update errors with probability 
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TABLE II: Spatial gene activity patterns of the network 
shown in Fig. 6. (stationary patterns after the system has 
reached equilibrium), Nc ~ 30. The numbers identify the 
"hidden" genes in Fig. HJfrom the left hand side to the right. 
States were transformed by a — > (a + 1) /2. 




FIG. 7: Quasi-particles, started by stochastic update errors, 
lead to control of the boundary position under noise (left 
panel). The F particle (top right) leads to readjustment of 
the boundary two cells to the left, the A particle (bottom 
right) leads to readjustment of the boundary one cell to the 
right. 



p per cell, leading to an average error rate r e = pNc- In- 
terestingly, this stochastic noise starts moving "particle" 
excitations in the cellular automaton which, as a result 
indeed stabilize the developmental structure of the sys- 
tem. To prepare for the details of these effects, define 
first how we measure the boundary position properly in 
the presence of noise. Let us use a statistical method to 
measure the boundary position in order to get conclu- 
sive results also for high p: Starting at i = 0, we put 
a "measuring frame" of size w over cell i and the next 
w — 1 cells, move this frame to the right and, for each i, 
measure the fraction z of cells with state a = 2 within 
the frame. The algorithm stops when z drops below 1/2 
and the boundary position is defined to be i + w/2. 
It is easy to see that, for not too high p. there are 
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only two different quasi-particles (i.e. state perturbations 
moving through the homogeneous phases), as shown in 
Fig. In the following, these particles are called T and 
A. The T particle is started in the 02 phase by a stochas- 
tic error (j{ = 2 — * er^ 7^ 2 at some i < aNc), moves to 
the right and, when reaching the domain boundary, read- 
justs it two cells to the left of its original position. The 
A particle is started in the 00 phase by a stochastic error 
o-j — — > Oj 7^ at some i > aNc and moves to the left. 
Interaction with the domain boundary readjusts it one 
cell to the right. Thus we find that the average position 
a* of the boundary is given by the rate equation 



2a*r e = (1 - a*)i 



(9) 



i.e. a* = 1/3. Interestingly, for not too high error rates 
r e , a* is independent from r e and thus from p. If we 
consider the average boundary position a* as a system- 
specific order parameter which is controlled by the two 
quasi-particles, then comparing the solution of Eqn. @ 
to the equilibrium position in the noiseless case indicates 
that the system undergoes a first order phase transition 
with respect to a* at p = r e = 0. 
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FIG. 9: Average domain boundary position a* as a function 
of the error rate r e , sampled over update windows of different 
lengths s w (ensemble statistics, 400 different initial conditions 
for each data point). The abscissa is logarithmic. With in- 
creasing s w , the transition from the solution a* det — 0.281 un- 
der deterministic dynamics to a* — 1/3 under noise is shifted 
towards r e — 0. The two straight lines define a lower bound- 
ary a* ow and a upper boundary a^ p , as explained in the text. 
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FIG. 8: Average boundary position a* as a function of the 
error rate r e for system sizes Nc ~ 100, Nc = 400, and 
Nc = 1600. The abscissa is logarithmic. Numerical data 
are averaged over 200 different initial conditions with 2 • 10 6 
updates each. The dashed curves show the mean field approx- 
imation given by Eqn. (5), the straight dashed lines mark the 
unperturbed solution a* = 0.281 and the solution under noise, 
a* = 1/3. 

Figs. El and ^| show noise dependence and finite 
size scaling of the transition. In case of a first order 
phase transition at r e = 0, we would expect a shift of 
the transition point r t e rans (s w ) towards r e = which 
is proportional to s^ 1 as well as a divergence of the 
slope at the transition point when s w is increased, i.e. 
da* j dr e (rl rans ) — > 00 when s w — > 00. 

The shift of r t J ans (s w ) is most easily measured by 
defining a lower and a upper boundary a* ow and a* p , 
respectively (Fig. |5J); when a* crosses these boundaries, 



FIG. 10: Finite size scaling of the upper and lower transition 
points r" p and r l f w , i.e. the points where a* crosses af ow and 
a„ p , respectively (Fig. 7), as a function of the sampling win- 
dow length s w . Both r^ p and r l ° w vanish oc s" 1 , as indicated 
by the line with slope —1 in this log-log-plot. 



two transition points r" p and r l ° w are obtained. We find 
that r^ p w CupS" 1 and r l ° w w c^s" 1 with c up > c low 
as expected (Fig. 1100 . which implies that the difference 
Arl rans (s w ) := rf - r l ° w scales as 

Ar t e rans (s w ) = (c up -c low )s-\ (10) 

hence, because Aa*(r'™" s ) = const. — a* tp — a* ow , in- 
deed da* /dr e (r t J' ans ) diverges when the sampling window 
size goes to infinity. 

The solution a* = 1/3 is stable only for < r e < 1/2. 
As shown in Fig.0 the interaction of a T particle with the 
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boundary needs only one update time step, whereas the 
boundary readjustment following a A particle interaction 
takes three update time steps. Hence, we conclude that 
the term on the right hand side of Eqn. (10), which gives 
the flow rate of A particles at the boundary, for large r e 
will saturate at 1/3, leading to 



0.08 



with the solution 



a 



2a*r, 



G 



0(N C ) 



(11) 



(12) 



for r e > 1/2. Hence, there is a crossover from the solution 
a* = 1/3 to another solution vanishing with r~ x around 
r e = 1/2. The finite size scaling term Q(Nc) can be es- 
timated from the following consideration: for p — ► 1, the 
average domain size created by "pure chance" is given by 
a* = Nc 1 Er^=o(l/ 3 )" ' n ~ (3/4) Nc 1 . If the measur- 
ing window has size w, we obtain <d(Nc) ~ (3/4) wN^ 1 . 
To summarize, we find that the self-organized boundary 
position is given by 



0.281 ±0.001 if r e = 

o ■ = { 1/3 if < r e < 1/2 

(1/6K- 1 + G(N C ) if r e > 1/2 



(13) 



with a first-order phase transition at r e — and a 
crossover around r e = 1/2. 

Now let us consider the fluctuations of a around a* 
given by the master equation 

p T (a) = 2ar e p T - 1 (a + 25) + (l~a)r e p T - 1 (a-S) 
+ (N c -r e )p T - 1 (a)-2ar eP T - 1 {a) 
- (l-a)r e p T -\a) (14) 

with 5 = 1/Nc- Eqn. (15) determines the probability 
p T {a) to find the boundary at position a at update time 
step t, given its position at time r — 1. This equation 
can be simplified as we are interested only in the station- 
ary probability distribution of a. It is easy to see that 
the error rate r e just provides a time scale for relaxation 
towards the stationary distribution and has no effect on 
the stationary distribution itself. Therefore, we may con- 
sider the limit r e — > r™ ax := Nc, divide through r e and 
neglect the last three terms on the right handside of Eqn. 
(14) (which become zero in this limit). We obtain 

p T {a) = 2ap T - 1 (a + 25) + (1 - a) p T -\a - 6). (15) 

To study this equation, we consider the continuum limit 
Nc — > oo. Let us introduce the scaling variables x = 
(a — a*)^/Nc, t — t/Nc and the probability density 
f(x,t) = Nc p 1 {a Nc)- Inserting these definitions into 
Eqn. (16) and ignoring all subdominant powers 0{\/Nc) 1 
we obtain a Fokker-Planck equation: 



df(x,t) 
dt 



dx 



0.07 - 

0.06 - 

0.05 - 

0.04 - 

0.03 - 
0.02 
0.01 



1 r 

N c = 400 
analytic result 

Nc = 800 
analytic result 
N c = 1600 
analytic result 



H 



4' 



100000 



50000 





0.25 



55 



0.3 



0.35 



0.4 



0.45 



0.5 



0.55 



FIG. 11: For the system with stochastic update errors, fluctu- 
ations of the boundary position a around the average position 
a* — 1/3 are Gaussian distributed. The figure compares the 
numerically obtained stationary probability distribution with 
the analytic result of Eqn. (18) for three different system sizes. 
All data are gained for r e = 0.1 and averaged over 100 dif- 
ferent initial conditions with 2 ■ 10 6 updates each. The inset 
shows a typical timeseries of the boundary position. 



The stationary solution of this equation is given by 



/(*) 



exp 



(17) 



i.e. in the long time limit t — > oo, the probability density 
for the boundary position a is a Gaussian with mean a* : 



p{a,N c ) = 



3N C 
2tt 



■ exp 



3Nr 



(a 



(18) 



From Eqn. (18) we see that the variance of a vanishes ~ 
1 /Nc and the relative boundary position becomes sharp 
in the limit of large system sizes. Fig. ^2 shows that 
this continuum approximation for Nc > 400 provides 
very good correspondence with the numerically obtained 
probability distributions. 

The stochastic nature of boundary stabilization under 
noise is also reflected by the probability distribution of 
waiting times t for boundary readjustments due to par- 
ticle interactions: the particle production is a Poisson 
process with the parameter A = r e and the waiting time 
distribution is given by 



Pwait(t) = r e exp(-r e t) 



(19) 



with an average waiting time (t) = r~ 2 . Fig. 1121 shows 
the waiting time distributions for different error rates r e . 

In a biological organism, a pattern has to be robust 
not only with respect to dynamical noise, but also with 
respect, e.g., to "mechanical" perturbations. In Hydra, 
e.g., there is a steady flow of cells directed towards the 
animal's head and foot, due to continued proliferation of 
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FIG. 12: Probability distribution p(t) of waiting times for 
boundary readjustements in the model with stochastic up- 
date errors for three different values of r e , semi- log plot. As 
expected for a Poisson process, p(t) is an exponential. 



stem cells [3(j ; the stationary pattern of gene activity is 
maintained is spite of this cell flow. Let us now study 
the robustness of the model with respect to this type of 
perturbation. Let us consider a constant cell flow with 
rate 77, which is directed towards the left or the right 
system boundary. In Eqn. (9), we now get an additional 
drift term 77 on the left hand side: 



2a* 



with the solution 



± 77 = r e (l - a*), 



if r e 



< r f 



(20) 



(21) 



for the case of cell flow directed towards the left system 
boundary (plus sign in eqn. J20JI). One observes that a* 
undergoes a second order phase transition at the critical 
value Tg"* = 77. Below rf zt , the domain size a* vanishes, 
and above rf lt it grows until it reaches the value a* max — 
1 /3 of the system without cell flow. For cell flow directed 
towards the right system boundary (minus sign in eqn. 
(|2T)jl L we obtain 



l 1 i 



if 77 <2r e 
if 77 > 2r e . 



(22) 



In this case, the critical cell flow rate is given by 77 = 2r e , 
for cell flow rates larger than this value the rj2-domain 
extends over the whole system, i.e. a* = 1. Fig. 1131 com- 
pares the results of numerical simulations with the mean 
field approximation of Eqn. (|20|l . In numerical simula- 
tions, cell flow is realized by application of the translation 
operator 9 tJi := to all cells with < i < Nq — 1 ev- 
ery r7 1 time steps and leaving <7jv c _i unchanged. In case 
of cell flow directed to the right system boundary, in the 
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FIG. 13: Average domain size a* as a function of the the cell 
flow rate 77 for three different error rates r e ; numerical data 
(crosses and points) were sampled over 10 different initial con- 
ditions and le6 updates for each data point. "Left" indicates 
cell flow directed to the left system boundary, "right" to the 
right system boundary, respectively. The dashed lines are the 
corresponding solutions of Eqn. I2U1 . 



limit 77 — > 1 the boundary position a* detected in numer- 
ical simulations deviates from the mean field prediction, 
due to a boundary effect at the left system boundary 
(stochastic production of finite lifetime stationary oscil- 
lators, leading to intermittent flows of T particles through 
the system). 

To summarize this part, we see that in the model 
stochastic errors in dynamical updates for r e > rf in- 
deed stabilize the global pattern against the mechanical 
stress of directed cell flow. 



V. SUMMARY AND DISCUSSION 

In this paper we considered the dynamics of pattern 
formation motivated by animal morphogenesis and the 
largely observed participation of complex gene regulation 
networks in their coordination and control. We there- 
fore chose a simple developmental problem to study toy 
models of interacting networks that control pattern for- 
mation and morphogenesis in this setting. In particular, 
the goal was to explore how networks can offer additional 
mechanisms beyond the standard diffusion based process 
of the Turing instability. Our results suggest that main 
functions of morphogenesis can be performed by dynami- 
cal networks without relying on diffusive biochemical sig- 
nals, but using local signaling between neighboring cells. 
This includes solving the problem of generating global 
position information from purely local interactions. But 
also it goes beyond diffusion based models as it offers so- 
lutions to developmental problems that are difficult for 
such models and avoids their inherent problem of fine 
tuned model parameters. It may be not too far fetched 
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to speculate that some developmental processes as, e.g., 
the establishment of position information, may rely on 
this type of internal information processing rather than 
on interpretation of, e.g., chemical gradients. As only in- 
gredient to the present model, symmetry has to be bro- 
ken locally by a spatially asymmetric receptor distribu- 
tion, e.g., as proposed in |l9j . to provide the necessary 
input information. Also hybrid approaches are conceiv- 
able. For example such broken symmetry could be pro- 
vided by a gradient, even without fine-tuning (contrary 
to standard gradient-based models) since only the direc- 
tion of the gradient has to be read out. 

The network model derived here performs accurate reg- 
ulation of position information and robust de novo pat- 
tern formation from random conditions, with a mecha- 
nism based on local information transfer rather than the 
Turing instability. Non-local information is transmitted 
through soliton-likc quasi-particles instead of long-range 
gradients. Two realizations as discrete dynamical net- 
works, Boolean networks and threshold networks, have 
been developed. The resulting networks have size and 
complexity comparable to developmental gene regulation 
modules as observed in animals, e.g., Drosophila [2{| or 
Hydra [||. The threshold networks (as models for tran- 
scriptional regulation networks) process position infor- 
mation in a hierarchical manner; in the present study, 
hierarchy levels were limited to three, but realizations 
with more levels of hierarchy, i.e. more "pre-processing" 
of information are also possible. Similar hierarchical and 
modular organization are typical signatures of gene reg- 
ulatory networks in organisms |l8j . 

Robustness of the model is studied in detail for two 
types of perturbations, stochastic update errors (noise) 
and directed cell flow. A first order phase transition is 
observed for vanishing noise and a second order phase 
transition at increasing cell flow. Fluctuations of the 
noise-controlled boundary position were studied numeri- 
cally for finite size systems and analytically in the contin- 
uum limit. We find that the relative size of fluctuations 
vanishes with 1/Nc, which means that the boundary po- 
sition becomes sharp in the limit of large system sizes. 
Dynamics under cell flow is studied in detail numerically 
and analytically by a mean field approximation. A ba- 
sic observation is that noise-induced perturbations act as 
quasi-particles that stabilize the pattern against the di- 
rected force of cell flow. At a critical cell flow rate, there 
is a second order phase transition towards a vanishing do- 
main size or a domain extending over the whole system, 
depending on the direction of cell flow, respectively. 

Several extensions of this model are conceivable. In 
the present model, the cell flow rate r/ is considered as 
a free parameter, the global pattern, however, can be 
controlled easily by an appropriate choice of the error 
rate r e . This may suggest to extend the model by in- 
troduction of some kind of dynamical coupling between 
r e and r/, treating rj as a function of r e . Interestingly, 
similar approaches have been studied by Hogeweg [31| 
and Furusawa and Kaneko [33, 1331 : In both models of 



morphogenesis, the rate of cell divisions is controlled by 
cell differentiation and cell-to-cell signaling. Dynamics in 
both models, however, is deterministic. An extension of 
our model as outlined above may open up for interest- 
ing studies how stochastic signaling events could control 
and stabilize a global expression pattern and cell flow as 
an integrated system. Other possible extensions of the 
model concern the dimensionality: In two or three di- 
mensions other mechanisms of symmetry breaking might 
be present, possibly leading to new, interesting dynami- 
cal effects. 

A Java applet simulation of the model can be found at 

M. 
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APPENDIX A: GENETIC ALGORITHM 
SEARCHES 

Let us briefly recapitulate here how the rule table of 
the model has been obtained by the aid of a genetic al- 
gorithm. 



1. Definition of the GA 

In order to find a set T of update rules that solve the 
problem as formulated in section II, cellular automata 
have been evolved using genetic algorithms [35[. Ge- 
netic algorithms are population-based search algorithms, 
which are inspired by the interplay of random mutations 
and selection as observed in biological evolution [36| . 
Starting from a randomly generated population of P rule 
tables /„, the algorithm optimizes possible solutions by 
evaluating the fitness function 



*(/») = 



1 



(T u -T)-N c 



T u /[a-N c ]-l 

E E ^ 



t=T u -T \ i=0 



Nc-l \ 

+ E {1-^(0,2} -(Al) 

i=[a-N c ] J 

The optimization algorithm then is defined as follows: 

1. Generate a random initial population T = 
{/i, fp} of rule tables. 



2. Randomly assign system sizes N™ m < iV" < N ( 
to all rule tables. 



max 
C 
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3. For each rule table, generate a random initial state 
vector. 

4. Randomly mutate one entry of each rule table (gen- 
crating a population T* of mutants). 

5. Iterate dynamics over T u time steps for T and J-*. 

6. Evaluate 4>(/n) an d ®(fn) f° r au rme tables < 
n < P, averaging over the past T u — T update steps 
(with an additional penalty term if /„ does not con- 
verge to a fixed point). 

7. For each n, replace /„ with /*, if $(/„) < $(/*). 

8. Replace the least fit solution by a duplicate of the 
fittest one. 

9. Go back to step 2 and iterate. 

The outcome of this search algorithm is a set of rule 
tables, which then can be "translated" into (spatially 
coupled) Boolean networks or threshold networks with 
suitable thresholds. This yields a set of (minimal) dy- 
namical networks which solve the pattern formation task 
by means of internal information processing. 
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FIG. 14: Average fitness (<&)(t) of the mutant population T* 
and fitness of the highest fitness mutant <&f, es t(t) as a func- 
tion of simulation time during the genetic algorithm run that 
lead to the high fitness solution used in this paper. At time 
step 10000 mutations were turned off, in order to test the es- 
tablished population of optimized rule tables under different 
initial conditions (this corresponds to the sharp increase of 
($)(t) at time step 10000). The evolved population of rule 
tables has an average fitness of about 0.98, independent from 
the initial conditions and system size Nc (in the tested range, 
i.e. 15 < N c < 150). 



2. Evolution of cellular automata 

The genetic algorithm sketched above is run with the 
following parameter choices: 15 < Nc < 150, i.e. during 
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FIG. 15: Average fitness of the highest fitness rule table as 
a function of the system size Nc- For system sizes Nc > 
80 the fitness is almost constant at about 0.98. Notice that 
the decrease of the fitness for small Nc is an effect of the 
dynamics, not of the genetic algorithm implementation (all 
Nc in the range 15 < Nc < 150 were tested with equal 
probability), hence the dynamics of pattern formation may 
impose a lower boundary on the range of system (animal) 
sizes tolerated by natural selection. 



GA runs the system size is varied randomly between 15 
and 150 cells, and the population size is set to P = 100. 
Fig. El shows the fitness of the highest fitness mutant 
and the average fitness of the population as a function 
of the number of successive mutation steps during opti- 
mization. A useful solution is found rather quickly (after 
about 200 updates), with further optimization observed 
during futher 10000 generations. At time step 10000 mu- 
tations are turned off, thus now the average fitness of the 
established population under random initial conditions 
and random fluctuations of the system size Nc is tested. 
The average fitness 4> « 0.98 indicates a surprisingly high 
robustness against fluctuations in the initial start pat- 
tern, indicating that the system is capable of de novo 
pattern formation. In the "fitness picture", Fig. 1151 con- 
firms that the dynamically regulated domain size ratio 
a/(l — a) indeed is independent of system size (propor- 
tion regulation) , there is only a weak decay of the fitness 
at small values of Nc- Interestingly, one also observes 
that for regeneration of Hydra polyps from random cell 
aggregates a minimum number of cells is required 0]. 
The model suggests that this observation might be ex- 
plained by the dynamics of an underlying pattern gen- 
erating mechanism, i.e. that there has to be a minimum 
diversity in the initial condition for successful de novo 
pattern formation. 
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FIG. 16: Frequency distribution p((Ji) of outputs as a func- 
tion of the rule table index as denoted in table 1. Ensemble 
statistics is taken over 80 different solutions with $ > 0.96. 
The upper panel shows the distribution for Oi — 0, the mid- 
dle panel the distribution for Oi — 1 and the lower panel the 
distribution for a% — 2. 

APPENDIX B: STATISTICAL ANALYSIS OF 
SOLUTIONS 



this point, we studied statistical two point correlations 
between the rule table entries. The probability for find- 
ing state a at position a and state a 1 at rule table position 
b is given by 



p ab (a, a') = — <W),a • <W)V, (Bl) 



where S is the Kronecker symbol and n runs over the 
statistical ensemble of size Ne ■ The two point correlation 
between a and b then is defined as 



C ab = ci ( max p ab (cr, a') - c 2 ) (B2) 

\( <T . cr ') / 



An interesting question is how "difficult" it would be 
for an evolutionary process driven by random mutations 
and selection to find solutions for the pattern formation 
problem based on neighbor interactions between cells. As 
we showed above, the genetic algorithm finds the correct 
solution fast, however, this does not necessarily mean 
that biological evolution could access the same solution 
as fast. If there is only one, singular solution, evolu- 
tion may never succeed finding it, as the genotype which 
already exists cannot be modified in an arbitrary way 
without possibly destroying function of the organism {de- 
velopmental constrains). To illustrate this point, we gen- 
erated an ensemble of Ne = 80 different solutions with 
$ > 0.96 and performed a statistical analysis of the rule 
table structure. 

As one can see in Fig. 1161 some positions in the rule 
table are quite fixed, i.e., there is not much variety in 
the outputs, whereas other positions are more variable. 
However, the output frequency distribution alone does 
not allow to really judge the "evolvability" of the solu- 
tions: if there are strong correlations between most of the 
rule table entries, evolutionary transitions from one so- 
lution to another would be almost impossible. To check 



0.6 - 



with ci = 9/8 and c% = 1/9 to obtain a proper normaliza- 
tion with respect to the two limiting cases of equal proba- 
bilities (p ab {(J, a') = 1/9 V(er, a')) and p ab (a 7 a') = 1 for 
a = a, a = a and p ab (<r, a') = for all other (a, cr')). 
Fig. 1171 shows the frequency distribution of C ab (a,a') : 
averaged over all possible pairs (a, b). About 65% of rule 
table positions are strongly correlated (C ab = 1.0), the 
rest shows correlation values between 0.3 and 1.0. Hence, 
we find that the space of solutions is restricted, never- 
theless there is variability in several rule table positions. 
To summarize this aspect, the pattern formation mecha- 
nism studied in this paper shows considerable robustness 
against rule mutations, however, a "core module" of rules 
is always fixed. Interestingly, a similar phenomenon is ob- 
served in developmental biology: Regulatory "modules" 
involved in developmental processes often are evolution- 
aryly very conservative, i.e., they are shared by almost 
all animal phyla 01 > while morphological variety is cre- 
ated by (few) taxon specific genes |6j and "rewiring" of 
existing developmental modules. 
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